Transform sparse array matrix code to perform a run-time dependency
check



Software-pipeline the transformed sparse array matrix code 

Form a predetermined number of variables based on a
virtual unrolling factor M

Initialize the formed predetermined number of variables
by creating variables b0, b1, ... bM-1 and a0, a1, ... aM-1
and initializing b0, b1, ... bM-1 to an illegal value for
the b array, say to -1

Load prior computed values such that b[i] is loaded into
the variable bM and the value of c[i] is loaded into the
variable cM

Assign the prior computed values, inside the loop body,
to a predetermined number of registers
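
The steps above can be illustrated with a small model. This is only a sketch under stated assumptions: the loop body is taken to be a[b[i]] = a[b[i]] + c[i], which is inferred from the cycle schedule (it computes &(a[b[i]]), loads it, adds c3 and stores a3), and M = 3 is chosen to match the three comparisons shown there. The function names and the queue of delayed stores are hypothetical modeling devices, not part of the source; the queue merely stands in for stores that land several iterations late in a pipelined loop.

```python
def scatter_add_reference(a, b, c):
    """Plain loop: a[b[i]] = a[b[i]] + c[i], one iteration at a time."""
    a = list(a)
    for i in range(len(b)):
        a[b[i]] = a[b[i]] + c[i]
    return a


def scatter_add_forwarded(a, b, c, m=3):
    """Same loop, but each store reaches memory m iterations late.

    prev_b / prev_a play the role of b0..bM-1 and a0..aM-1: the indices
    and results of the last m iterations, whose stores are still in flight.
    """
    a = list(a)
    prev_b = [-1] * m   # -1 is an illegal index, so nothing matches at first
    prev_a = [0] * m
    pending = []        # (index, value) stores not yet written to memory
    for i in range(len(b)):
        b_new, c_new = b[i], c[i]
        a_old = a[b_new]        # may be stale if an in-flight store hits b_new
        value = a_old + c_new   # default: no dependence found
        # Run-time dependency check, oldest to newest, so the most
        # recent matching iteration supplies the forwarded value.
        for k in range(m):
            if prev_b[k] == b_new:
                value = prev_a[k] + c_new
        pending.append((b_new, value))
        if len(pending) > m:    # retire the store issued m iterations ago
            idx, val = pending.pop(0)
            a[idx] = val
        # Rotate the saved variables: drop the oldest, append the current.
        prev_b = prev_b[1:] + [b_new]
        prev_a = prev_a[1:] + [value]
    for idx, val in pending:    # drain remaining stores in program order
        a[idx] = val
    return a
```

With repeated indices in b, the forwarded version should give the same result as the plain loop even though its loads can see stale memory, because any in-flight update to the same element is picked up from the saved variables instead.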

Cycle 0:  b3 = load b[i]
          c3 = load c[i]
Cycle 1:
Cycle 2:  compute &(a[b[i]])
          cmp.ne p1,p4 = b3,b2
Cycle 3:  a3old = a[b[i]]
Cycle 4:  (p1) cmp.ne.unc p1,p3 = b3,b1
Cycle 5:  (p1) cmp.ne.unc p1,p2 = b3,b0
Cycle 6:
Cycle 7:
Cycle 8:
Cycle 9:
Cycle 10: (p2) a3 = a0 + c3
Cycle 11: (p3) a3 = a1 + c3
Cycle 12: (p1) a3 = a3old + c3
Cycle 13: (p4) a3 = a2 + c3
Cycle 14:
Cycle 15:
Cycle 16:
Cycle 17:
Cycle 18:
Cycle 19:
Cycle 20:
Cycle 21: store a[b[i]] = a3
Cycle 22:
Cycle 23:
Cycle 24:
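
The predicate chain in the schedule above can be read as a priority selection among the saved values. The sketch below models it in Python. It assumes that a compare with the "unc" completer clears both target predicates when its qualifying predicate is false; the function name predicated_sum and its argument order are hypothetical and chosen only for illustration.

```python
def predicated_sum(b3, b2, b1, b0, a2, a1, a0, a3old, c3):
    """Model of cycles 2 to 13: pick the value to add c3 to."""
    # Cycle 2: cmp.ne p1,p4 = b3,b2
    p1, p4 = (b3 != b2), (b3 == b2)

    # Cycle 4: (p1) cmp.ne.unc p1,p3 = b3,b1
    # Assumption: with "unc", both targets are cleared when the
    # qualifying predicate p1 is false.
    if p1:
        p1, p3 = (b3 != b1), (b3 == b1)
    else:
        p1, p3 = False, False

    # Cycle 5: (p1) cmp.ne.unc p1,p2 = b3,b0
    if p1:
        p1, p2 = (b3 != b0), (b3 == b0)
    else:
        p1, p2 = False, False

    # Exactly one predicate survives, so exactly one add below executes.
    assert [p1, p2, p3, p4].count(True) == 1

    a3 = None
    if p2:            # Cycle 10: matched the oldest saved index b0
        a3 = a0 + c3
    if p3:            # Cycle 11: matched b1
        a3 = a1 + c3
    if p1:            # Cycle 12: no match, use the value loaded from memory
        a3 = a3old + c3
    if p4:            # Cycle 13: matched the most recent saved index b2
        a3 = a2 + c3
    return a3
```

Under this reading, a match against b2 takes priority over b1, which takes priority over b0, so the most recent earlier iteration that wrote the same element supplies the operand.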

Stage 1

Cycle 0:  b3 = load b[i]
          c3 = load c[i]
Cycle 1:
Cycle 2:  compute &(a[b[i]])
          cmp.ne p1,p4 = b3,b2
Cycle 3:  a3old = a[b[i]]
Cycle 4:  (p1) cmp.ne.unc p1,p3 = b3,b1

Stage 2

Cycle 5:  (p1) cmp.ne.unc p1,p2 = b3,b0
Cycle 6:
Cycle 7:
Cycle 8:
Cycle 9:

Stage 3

Cycle 10: (p2) a3 = a0 + c3
Cycle 11: (p3) a3 = a1 + c3
Cycle 12: (p1) a3 = a3old + c3